Classification - HELOC Credit Risk

Predicting credit risk for Home Equity Line of Credit applications using the FICO HELOC dataset.

Dataset Source: FICO HELOC Dataset
Problem Type: Classification
Target Variable: RiskPerformance - Whether applicant will pay as negotiated (Good/Bad)
Use Case: Credit risk assessment for financial institutions to identify borrowers at risk of defaulting

Package Imports

Install and import relevant packages

!pip install xplainable
!pip install xplainable-client

import pandas as pd
from sklearn.model_selection import train_test_split
import requests
import json

import xplainable as xp
from xplainable.core.models import XClassifier
from xplainable.core.optimisation.bayesian import XParamOptimiser
from xplainable.preprocessing.pipeline import XPipeline
from xplainable.preprocessing import transformers as xtf

# New refactored client import
from xplainable_client.client.client import XplainableClient
from xplainable_client.client.base import XplainableAPIError

Data Loading and Exploration

Load the HELOC dataset and explore its structure

# Load dataset
data = pd.read_csv('https://xplainable-public-storage.syd1.digitaloceanspaces.com/example_data/heloc_dataset.csv')

# Display basic information
print(f"Dataset shape: {data.shape}")
print(f"Target distribution:\n{data['RiskPerformance'].value_counts()}")
data.head()

The fields are defined as follows:

Variable Names	Description
RiskPerformance	Paid as negotiated flag (12-36 Months). String of Good and Bad
ExternalRiskEstimate	Consolidated version of risk markers
MSinceOldestTradeOpen	Months Since Oldest Trade Open
MSinceMostRecentTradeOpen	Months Since Most Recent Trade Open
AverageMInFile	Average Months in File
NumSatisfactoryTrades	Number of Satisfactory Trades
NumTrades60Ever2DerogPubRec	Number of Trades 60+ Ever
NumTrades90Ever2DerogPubRec	Number of Trades 90+ Ever
PercentTradesNeverDelq	Percent of Trades Never Delinquent
MSinceMostRecentDelq	Months Since Most Recent Delinquency
MaxDelq2PublicRecLast12M	Max Delinquency/Public Records in the Last 12 Months. See tab 'MaxDelq' for each category
MaxDelqEver	Max Delinquency Ever. See tab 'MaxDelq' for each category
NumTotalTrades	Number of Total Trades (total number of credit accounts)
NumTradesOpeninLast12M	Number of Trades Open in Last 12 Months
PercentInstallTrades	Percent of Installment Trades
MSinceMostRecentInqexcl7days	Months Since Most Recent Inquiry excluding the last 7 days
NumInqLast6M	Number of Inquiries in the Last 6 Months
NumInqLast6Mexcl7days	Number of Inquiries in the Last 6 Months excluding the last 7 days. Excluding the last 7 days removes inquiries that are likely due to price comparison shopping.
NetFractionRevolvingBurden	This is the revolving balance divided by the credit limit
NetFractionInstallBurden	This is the installment balance divided by the original loan amount
NumRevolvingTradesWBalance	Number of Revolving Trades with Balance
NumInstallTradesWBalance	Number of Installment Trades with Balance
NumBank2NatlTradesWHighUtilization	Number of Bank/National Trades with high utilization ratio
PercentTradesWBalance	Percent of Trades with Balance
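Several of these fields can hold negative codes instead of real measurements (the sample prediction later in this tutorial shows a value of -7, for instance). In the FICO data dictionary, as it is commonly documented, -9 means no bureau record or no investigation, -8 means no usable or valid trades or inquiries, and -7 means the condition was not met. The sketch below counts such codes per column with pandas. The toy DataFrame and its values are made up for illustration; on the real data you would apply the same call to `data`.

```python
import pandas as pd

# Special codes as described in the FICO data dictionary (assumed meanings,
# see the paragraph above).
SPECIAL_VALUES = [-9, -8, -7]

# Toy frame standing in for the real dataset; the values are invented.
toy = pd.DataFrame({
    "ExternalRiskEstimate": [68, -9, 75, 59],
    "MSinceMostRecentDelq": [12, -7, -7, 3],
    "NetFractionInstallBurden": [85, -8, 40, -8],
})

# isin() marks every cell that holds a special code; sum() counts per column.
special_counts = toy.isin(SPECIAL_VALUES).sum()
print(special_counts)
```

Knowing how often these codes occur helps when reading feature contributions, since a code such as -7 is a category rather than a point on the numeric scale.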

1. Data Preprocessing

Prepare features and target variable

# Separate the target from the features
X = data.drop(columns=['RiskPerformance'])
y = data['RiskPerformance']

Create Train/Test Split

X, y = data.drop(columns=['RiskPerformance']), data['RiskPerformance']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
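Passing `stratify=y` keeps the Good/Bad ratio the same in both partitions. A minimal sketch of that effect on synthetic labels follows; the 60/40 class mix is an assumption chosen so the counts divide evenly, not the real HELOC distribution.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Synthetic labels: 60 "Bad" and 40 "Good" (illustrative, not the real data).
labels = ["Bad"] * 60 + ["Good"] * 40
rows = list(range(len(labels)))

# Same arguments as the tutorial's split, applied to the toy data.
_, _, y_tr, y_te = train_test_split(
    rows, labels, test_size=0.2, random_state=42, stratify=labels
)

train_counts = Counter(y_tr)
test_counts = Counter(y_te)
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

Both partitions keep the 60/40 mix, so metrics computed on the test set are not skewed by an unlucky draw of one class.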

2. Model Optimization

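The XParamOptimiser searches for the XClassifier hyperparameters that maximise a chosen metric. The call below targets the F1 score, allows up to 300 trials, evaluates each trial with 2 cross-validation folds, and sets early_stopping to 150.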

opt = XParamOptimiser(metric='f1-score', n_trials=300, n_folds=2, early_stopping=150)
params = opt.optimise(X_train, y_train)
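The optimiser above is asked to maximise 'f1-score'. As a reminder of what that metric measures, here is a minimal pure-Python sketch. The helper name `f1_for_class` is hypothetical and not part of xplainable, and treating 'Bad' as the positive class is an assumption made only for this example.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration.
y_true = ["Bad", "Bad", "Good", "Good", "Bad", "Good"]
y_pred = ["Bad", "Good", "Good", "Bad", "Bad", "Good"]

f1 = f1_for_class(y_true, y_pred, positive="Bad")
print(f"F1 for class 'Bad': {f1:.3f}")
```

Because F1 balances precision against recall, it penalises a model that flags almost everyone as 'Bad' as well as one that misses most risky applicants.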

3. Model Training

Train the XClassifier with optimized parameters.

model = XClassifier(**params)
model.fit(X_train, y_train)

4. Model Interpretability and Explainability

Generate insights into the model's decision-making process and understand feature importance.

model.explain()

Analysing Feature Importances and Contributions

Click on the bars to see the importances and contributions of each variable.

Feature Importances

The relative significance of each feature (or input variable) in making predictions. It indicates how much each feature contributes to the model’s predictions, with higher values implying greater influence.

Feature Significance

The effect of each feature on individual predictions. In this model, feature contributions show how a given feature value (for example NetFractionRevolvingBurden, the revolving balance divided by the credit limit) raises or lowers the predicted risk score for a particular applicant.

5. Model Persistence

Save the model to Xplainable Cloud for collaboration and deployment.

In this step, we register our HELOC risk prediction model with Xplainable Cloud using client.models.create_model, passing the fitted model, a name and description, and our training data. The call returns two identifiers. The model_id identifies the model that predicts the likelihood of applicants defaulting on their line of credit, and the version_id identifies this particular iteration of it, allowing us to track and manage different versions systematically.

Xplainable Cloud Setup

# Initialize Xplainable Cloud client using new refactored client
client = XplainableClient(
    api_key="",  # Add your API key from https://platform.xplainable.io/
    hostname="https://platform.xplainable.io"  # Optional, defaults to production
)

# Create a model using the new client's models service
try:
    model_id, version_id = client.models.create_model(
        model=model,
        model_name="HELOC Credit Risk Model",
        model_description="Predicting applicant credit risk for HELOC applications",
        x=X_train,
        y=y_train
    )
    print("Model created successfully!")
    print(f"Model ID: {model_id}")
    print(f"Version ID: {version_id}")
except XplainableAPIError as e:
    print(f"Error creating model: {e.message}")
    model_id, version_id = None, None

6. Model Deployment

Deploy the model for real-time predictions.

The code block below deploys our credit risk prediction model with client.deployments.deploy, passing the version_id obtained in the previous step as model_version_id. The deployment response includes a deployment_id, which the following steps use to activate the deployment and to generate a deploy key.

model_id

Out:

{'model_id': 'TVCwjtghAkCR8KSQ', 'version_id': 'RbYBRcTBfLuyTYUF'}

print(f"Model ID: {model_id}")
print(f"Version ID: {version_id}")

# Deploy the model using the new client's deployments service
try:
    deployment_response = client.deployments.deploy(model_version_id=version_id)
    deployment_id = deployment_response.deployment_id
    print("Model deployed successfully!")
    print(f"Deployment ID: {deployment_id}")
except XplainableAPIError as e:
    print(f"Error deploying model: {e.message}")
    deployment_id = None

print(f"Deployment ID: {deployment_id}")

Activating the Deployment: The deployment is activated using client.deployments.activate_deployment, which changes its status to active so that it can accept prediction requests.

# Activate deployment using the new client
try:
    client.deployments.activate_deployment(deployment_id)
    print("Deployment activated successfully!")
except XplainableAPIError as e:
    print(f"Error activating deployment: {e.message}")

Out:

{'message': 'activated deployment'}

# Generate deployment key using the new client
try:
    deploy_key = client.deployments.generate_deploy_key(
        deployment_id=deployment_id,
        description='HELOC Deploy Key',
        days_until_expiry=7
    )
    print(f"Deployment key generated: {str(deploy_key)[:20]}...")
except XplainableAPIError as e:
    print(f"Error generating deploy key: {e.message}")
    deploy_key = None

Out:

<Response [200]>

# Choose how to build the request payload:
#   1 = ask the client for an example payload
#   2 = sample one row from the dataset
option = 2

if option == 1:
    # Generate example payload using the new client
    try:
        body = client.deployments.generate_example_deployment_payload(deployment_id)
    except Exception:
        # Fall back to a sampled row if the example payload cannot be generated
        body = json.loads(data.drop(columns=["RiskPerformance"]).sample(1).to_json(orient="records"))
else:
    body = json.loads(data.drop(columns=["RiskPerformance"]).sample(1).to_json(orient="records"))

Making a Prediction Request: A POST request is made to the model's prediction endpoint with the example payload. The model processes the input data and returns a prediction response, which includes the predicted class (e.g., 'Good'), the overall score, and a breakdown of how much each feature contributed to that score.

# Make prediction request
if deploy_key:
    response = requests.post(
        url="https://inference.xplainable.io/v1/predict",
        headers={'api_key': str(deploy_key)},  # Convert deploy_key to string
        json=body
    )

    value = response.json()
    print("Prediction response:")
    print(value)
else:
    print("Deploy key not available, skipping prediction test")
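Before sending a request, it can help to validate the payload and set a timeout so a stalled connection does not hang the notebook. The sketch below assembles the keyword arguments for `requests.post` without sending anything. The helper name `build_predict_kwargs` and the 10-second timeout are assumptions for illustration; the URL and the `api_key` header are taken from the request above.

```python
import json

# Endpoint used in the tutorial's prediction request.
PREDICT_URL = "https://inference.xplainable.io/v1/predict"


def build_predict_kwargs(deploy_key, records, timeout=10):
    """Assemble keyword arguments for requests.post (hypothetical helper)."""
    if not deploy_key:
        raise ValueError("a deploy key is required")
    if not isinstance(records, list) or not all(isinstance(r, dict) for r in records):
        raise TypeError("records must be a list of dicts, one per applicant")
    json.dumps(records)  # raises early if the payload is not JSON-serialisable
    return {
        "url": PREDICT_URL,
        "headers": {"api_key": str(deploy_key)},
        "json": records,
        "timeout": timeout,
    }


# Placeholder key and a truncated record, for illustration only.
kwargs = build_predict_kwargs(
    "example-key", [{"ExternalRiskEstimate": 68, "NumInqLast6M": 0}]
)
print(kwargs["url"], kwargs["timeout"])

# To send it (requires the requests package and a valid deploy key):
#   response = requests.post(**kwargs)
#   response.raise_for_status()  # surface HTTP errors before parsing JSON
```

Calling `raise_for_status()` before `response.json()` makes authentication or payload errors visible instead of failing later on an unexpected response body.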

Out:

[{'index': 0,
  'id': None,
  'partition': '__dataset__',
  'score': 0.5011964337369303,
  'proba': None,
  'pred': 'Good',
  'support': None,
  'breakdown': [{'feature': 'base_value', 'value': None, 'score': 0.4780686028445082},
   {'feature': 'ExternalRiskEstimate', 'value': '68', 'score': -0.010494705670801262},
   {'feature': 'MSinceOldestTradeOpen', 'value': '156', 'score': 0.002339879609304599},
   {'feature': 'MSinceMostRecentTradeOpen', 'value': '4', 'score': -0.00047711791608256044},
   {'feature': 'AverageMInFile', 'value': '75', 'score': 0.0020254337303166397},
   {'feature': 'NumSatisfactoryTrades', 'value': '31', 'score': 0.0023500082246000567},
   {'feature': 'NumTrades60Ever2DerogPubRec', 'value': '0', 'score': 0.011256220898831106},
   {'feature': 'NumTrades90Ever2DerogPubRec', 'value': '0', 'score': 0.007727365694721391},
   {'feature': 'PercentTradesNeverDelq', 'value': '94', 'score': -0.004994618505281867},
   {'feature': 'MSinceMostRecentDelq', 'value': '12', 'score': -0.019030421059650676},
   {'feature': 'MaxDelq2PublicRecLast12M', 'value': '6', 'score': -0.003371928729410296},
   {'feature': 'MaxDelqEver', 'value': '6', 'score': -0.008044595168283326},
   {'feature': 'NumTotalTrades', 'value': '35', 'score': 0.012728462697346309},
   {'feature': 'NumTradesOpeninLast12M', 'value': '1', 'score': 0.00315905436830356},
   {'feature': 'PercentInstallTrades', 'value': '37', 'score': 0.0030159164487750406},
   {'feature': 'MSinceMostRecentInqexcl7days', 'value': '-7', 'score': -0.010597293564717137},
   {'feature': 'NumInqLast6M', 'value': '0', 'score': 0.011828849647445355},
   {'feature': 'NumInqLast6Mexcl7days', 'value': '0', 'score': 0.011041977442703738},
   {'feature': 'NetFractionRevolvingBurden', 'value': '41', 'score': -0.00766693504377547},
   {'feature': 'NetFractionInstallBurden', 'value': '85', 'score': -0.006728467309717369},
   {'feature': 'NumRevolvingTradesWBalance', 'value': '5', 'score': -0.0022589957615032513},
   {'feature': 'NumInstallTradesWBalance', 'value': '2', 'score': 0.001922561631456002},
   {'feature': 'NumBank2NatlTradesWHighUtilization', 'value': '0', 'score': 0.025646402630966857},
   {'feature': 'PercentTradesWBalance', 'value': '70', 'score': 0.0017507765968747127}]}]
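The breakdown in this response can be read additively: adding the base_value to every feature score reproduces the overall score of the prediction. The sketch below checks that on the numbers from the response above and ranks the features by the size of their contribution. That this additivity holds for every response is an assumption based on this single example.

```python
# Scores copied from the 'breakdown' list in the response above.
breakdown = {
    "base_value": 0.4780686028445082,
    "ExternalRiskEstimate": -0.010494705670801262,
    "MSinceOldestTradeOpen": 0.002339879609304599,
    "MSinceMostRecentTradeOpen": -0.00047711791608256044,
    "AverageMInFile": 0.0020254337303166397,
    "NumSatisfactoryTrades": 0.0023500082246000567,
    "NumTrades60Ever2DerogPubRec": 0.011256220898831106,
    "NumTrades90Ever2DerogPubRec": 0.007727365694721391,
    "PercentTradesNeverDelq": -0.004994618505281867,
    "MSinceMostRecentDelq": -0.019030421059650676,
    "MaxDelq2PublicRecLast12M": -0.003371928729410296,
    "MaxDelqEver": -0.008044595168283326,
    "NumTotalTrades": 0.012728462697346309,
    "NumTradesOpeninLast12M": 0.00315905436830356,
    "PercentInstallTrades": 0.0030159164487750406,
    "MSinceMostRecentInqexcl7days": -0.010597293564717137,
    "NumInqLast6M": 0.011828849647445355,
    "NumInqLast6Mexcl7days": 0.011041977442703738,
    "NetFractionRevolvingBurden": -0.00766693504377547,
    "NetFractionInstallBurden": -0.006728467309717369,
    "NumRevolvingTradesWBalance": -0.0022589957615032513,
    "NumInstallTradesWBalance": 0.001922561631456002,
    "NumBank2NatlTradesWHighUtilization": 0.025646402630966857,
    "PercentTradesWBalance": 0.0017507765968747127,
}
reported_score = 0.5011964337369303  # 'score' field of the response

# Base value plus all feature contributions.
total = sum(breakdown.values())
print(f"sum of breakdown: {total:.10f}  reported: {reported_score:.10f}")

# Rank features by absolute contribution, excluding the base value.
contributions = {k: v for k, v in breakdown.items() if k != "base_value"}
top3 = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:3]
for name, value in top3:
    print(f"{name}: {value:+.4f}")
```

For this applicant, the largest single push comes from NumBank2NatlTradesWHighUtilization (a value of 0), while MSinceMostRecentDelq (12 months) pulls the score in the opposite direction.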
